Acronyms

• Combinatorial logic (CL)
• Commercial off the shelf (COTS)
• Complementary metal-oxide semiconductor (CMOS)
• Device under test (DUT)
• Edge-triggered flip-flops (DFFs)
• Error rate ($\lambda$)
• Error rate per bit ($\lambda_{bit}$)
• Error rate per system ($\lambda_{system}$)
• Field programmable gate array (FPGA)
• Global triple modular redundancy (GTMR)
• Hardware description language (HDL)
• Input-output (I/O)
• Intellectual Property (IP)
• Linear energy transfer (LET)
• Mean fluence to failure (MFTF)
• Mean time to failure (MTTF)
• Number of used bits (#UsedBits)
• Operational frequency (fs)
• Personal Computer (PC)
• Probability of configuration upsets ($P_{configuration}$)
• Probability of functional logic upsets ($P_{functionalLogic}$)
• Probability of single event functional interrupt ($P_{SEFI}$)
• Probability of system failure ($P_{system}$)
• Processor (PC)
• Radiation Effects and Analysis Group (REAG)
• Reliability over time ($R(t)$)
• Reliability over fluence ($R(\Phi)$)
• Single event effect (SEE)
• Single event functional interrupt (SEFI)
• Single event latch-up (SEL)
• Single event transient (SET)
• Single event upset (SEU)
• Single event upset cross-section ($\sigma_{SEU}$)
• System on a chip (SoC)
• Xilinx Virtex 5 field programmable gate array (V5)
• Xilinx Virtex 5 field programmable gate array, radiation hardened (V5QV)
Problem Statement

Conventional methods of applying single event upset (SEU) data to complex systems need improvement.

The problem boils down to extrapolation and application of SEU data to characterize system performance in radiation environments.
Abstract: Impact to Community

• We are investigating the application of classical reliability performance metrics combined with standard SEU analysis data.
• We expect to relate SEU behavior to system performance requirements...
  - Should we characterize systems by upset rates? Is that sufficient? What does it even mean?
  - Our proposed methodology will provide better prediction of SEU responses in harsh radiation environments.
• When a system is targeted for space, single event effect (SEE) data is obtained for all devices that make up that system.
• Combining all the data is not simple addition.
• Co-dependent susceptibilities exist and must be handled accordingly.
• The scope of this presentation will be System on a Chip (SoC) field programmable gate array analysis.
• Future presentations will expand to address systems at the box level.
Background
FPGA SEU Susceptibility

Measured in SEU cross-section ($\sigma_{SEU}$)

• $\sigma_{SEU}$s (per category) are calculated from SEU test and analysis.
• $\sigma_{SEU}$s are calculated with particles that vary in linear energy transfer (LET).
• FPGA architectures vary and so do their SEU responses.
• Most believe the dominant $\sigma_{SEU}$s are per bit (configuration or functional logic). However, global routes are also significant.

$P(fs)_{system} \propto P_{configuration} + P(fs)_{functionalLogic} + P_{SEFI}$

  - $P(fs)_{system}$: design $\sigma_{SEU}$
  - $P_{configuration}$: configuration $\sigma_{SEU}$ ($\sigma_{SEU}$s are measured by bit)
  - $P(fs)_{functionalLogic}$: functional logic $\sigma_{SEU}$, covering sequential and combinatorial logic (CL) in the data path ($\sigma_{SEU}$s are measured by bit)
  - $P_{SEFI}$: SEFI $\sigma_{SEU}$, covering global routes and hidden logic

For a system, should $\sigma_{SEU}$s be measured by bit?


Background
Conventional Goal: Convert SEU cross-sections ($\sigma_{SEU}$: cm²/particle) to error rates ($\lambda$) for complex systems

• Perform SEU accelerated radiation testing across ions with different linear energy transfers (LETs) to calculate $\sigma_{SEU}$s per LET.
• Bottom-Up approach (transistor level):
  - Given $\sigma_{SEU}$ (per bit), use an error rate calculator (such as CREME96) to obtain an error rate per bit ($\lambda_{bit}$).
  - Multiply $\lambda_{bit}$ by the dominant number of used memory bits (#UsedBits) in the target design to attain a system error rate ($\lambda_{system}$).
• Top-Down approach (system level):
  - Given $\sigma_{SEU}$ (per system), use an error rate calculator (such as CREME96) to obtain an error rate per system ($\lambda_{system}$).

[Figure: device cross-section showing a lightly doped epi layer over a heavily doped substrate. LET: linear energy transfer.]
Technical Problems with Current Methods of Error Rate Calculation

• For submission to CREME96, $\sigma_{SEU}$ data (across LET) are fitted to a Weibull curve.
  - The two main parameters for curve fitting are a shape factor and a slope factor.
  - During the curve fitting process, a large amount of error can be introduced.
  - Consequently, it is possible for resultant error rates (for the same design) to vary by decades.
• Because of the error rate calculation process, $\sigma_{SEU}$ data are blended together and it is nearly impossible to home in on the problem spots. This can become important for mitigation insertion.

[Figure: $\sigma_{SEU}$ data versus LET (MeV·cm²/mg), log scale from 1.00E-08 to 1.00E-02, LET from 0 to 60. Annotation: data mimic the wear-out portion of a Weibull curve.]
Technical Problems with Bottom-Up Analysis Method (1)

• Multiplying each bit within a design by $\lambda_{bit}$ is not an efficient method of system error rate prediction.
  - Works well with memory structures, but complex systems do not operate like memories.
  - If an SEU affects a bit, and the bit is either inactive, disabled, or masked, a system malfunction might not occur.
• Using the same multiplication factor across DFFs will produce extreme over-estimates:

  $\lambda_{system} < \lambda_{bit} \times \#UsedBits$

• To date, there is no accurate method to predict DFF activity for complex systems.
• Fault injection or simulation will not determine frequency of activity.
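To make the over-estimate concrete, the sketch below computes the bottom-up product $\lambda_{bit} \times \#UsedBits$. The function name and the numeric inputs are hypothetical placeholders, not values from this presentation. The product treats every used bit as always active and unmasked, which is why it can only serve as an upper bound.

```python
def bottom_up_upper_bound(lambda_bit: float, used_bits: int) -> float:
    """Return lambda_bit * #UsedBits, an upper bound on the system error rate.

    Assumes every used bit is always active and never masked, which is the
    source of the over-estimate for complex systems.
    """
    if lambda_bit < 0 or used_bits < 0:
        raise ValueError("inputs must be non-negative")
    return lambda_bit * used_bits


# Hypothetical inputs (not from the presentation): errors per bit-day, bit count.
example_bound = bottom_up_upper_bound(lambda_bit=1.0e-9, used_bits=2_000_000)
print(f"upper bound on system error rate: {example_bound:.3e} errors/day")
```

The actual system error rate sits somewhere below this figure, and the bottom-up method gives no way to say how far below.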
Technical Problems with Bottom-Up Analysis Method (2)

• There are a variety of components that are susceptible to SEUs (clocks, resets, combinatorial logic, flip-flops (DFFs), etc.).
  - Various component susceptibilities are not accurately characterized at a per-bit level.
  - Design topology makes a significant difference in susceptibility and is not characterized in error rate calculators (e.g., CREME96).

Error rates calculated at the transistor-bit level are estimated at too small a granularity for proper extrapolation to complex systems.
• Classical reliability models have been used as a standard metric for complex system performance.
• The analysis provides a more in-depth interpretation of system behavior over time by using system-level MTTF data for system performance metrics.

$R(t) = e^{-t/MTTF}$ or $R(t) = e^{-\lambda t}$
Weibull Failure Rate ($\lambda(t)$) Bathtub Curve

[Figure: bathtub curve of failure rate versus time (hours, miles, cycles, etc.). Early Life: failure rate decreases with time. Useful Life: failure rate approximately constant (independent events). Wearout Life: failure rate increases with time.]

We will focus on the "Useful Life" of the bathtub curve for this analysis.
Mapping Classical Reliability Models from The Time Domain To The Fluence Domain

The exponential model that relates reliability to MTTF assumes that during useful lifetime:

  - Failures are independent.
  - Error rate is constant (Weibull slope = 1: exponential).
  - MTTF = $1/\lambda$.

$R(t) = e^{-t/MTTF}$ or $R(t) = e^{-\lambda t}$

For a given LET (across fluence):

  - SEUs are independent.
  - $\sigma_{SEU}$ is constant.
  - MFTF = $1/\sigma_{SEU}$.

Parallel between time and fluence: $\sigma_{SEU}$ = #errors/fluence; $\lambda_{system}$ = #errors/time.

Hence, mapping from the time domain to the fluence domain (per LET) is straightforward:

  - $t \Leftrightarrow \Phi$
  - MTTF $\Leftrightarrow$ MFTF
  - $\lambda \Leftrightarrow \sigma_{SEU}$

$R(t) = e^{-t/MTTF} \Rightarrow R(\Phi) = e^{-\Phi/MFTF}$
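The time-to-fluence mapping can be sketched in a few lines of Python. The function names are illustrative choices, not part of the original material, and the numeric inputs in the usage lines are hypothetical. The sketch assumes the constant-rate (exponential) model stated on this slide.

```python
import math


def reliability_over_time(t: float, mttf: float) -> float:
    """R(t) = exp(-t / MTTF), valid for a constant error rate."""
    return math.exp(-t / mttf)


def mftf_from_cross_section(sigma_seu: float) -> float:
    """MFTF = 1 / sigma_SEU (particles/cm^2 when sigma is in cm^2/particle)."""
    return 1.0 / sigma_seu


def reliability_over_fluence(fluence: float, mftf: float) -> float:
    """R(Phi) = exp(-Phi / MFTF), the fluence-domain analogue of R(t)."""
    return math.exp(-fluence / mftf)


def fluence_for_reliability(target: float, mftf: float) -> float:
    """Fluence at which R(Phi) has dropped to `target` (0 < target <= 1)."""
    return -mftf * math.log(target)


# Hypothetical cross-section, for illustration only.
mftf = mftf_from_cross_section(1.0e-5)          # about 1.0e5 particles/cm^2
print(reliability_over_fluence(10.0, mftf))     # reliability after 10 particles/cm^2
print(fluence_for_reliability(0.999, mftf))     # fluence budget for three nines
```

Inverting $R(\Phi)$ as in `fluence_for_reliability` gives the fluence budget for a required reliability, which is the quantity a mission requirement is compared against.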
Creating Reliability Curves from $\sigma_{SEU}$s

• $\sigma_{SEU}$ data is system level.
• A histogram of environment data is created. Bins are determined by LET values at each $\sigma_{SEU}$ data point.
• For each data point at a given LET, a combination of binned environment data and upper-bound $\sigma_{SEU}$ data are used to determine system reliability performance.
• A piecemeal approach is performed per data point to determine the weakest points of system performance.

[Figure: $\sigma_{SEU}$ data versus LET (MeV·cm²/mg), log scale, LET from 0 to 60.]

[Figure: integral flux (#/cm²/day) > LET versus LET (MeV·cm²/mg), 100 mils Al shielding. Source: M. A. Xapsos, IEEE NSREC Short Course, Ponte Vedra Beach, FL, 2008.]
Example of Proposed Methodology Application

• Mission requirements:
  - The FPGA shall contain an embedded microprocessor.
  - Selection shall be made between a Xilinx V5QV (very expensive device) and a Xilinx V5 with embedded PowerPC (relatively cheap device).
  - FPGA operation shall have reliability of 3-nines (99.9%) within a 10-minute window at Geosynchronous Equatorial Orbit (GEO).
• Proposed methodology:
  - Create a histogram of particle flux versus LET for a 10-minute window of time for your target environment.
  - Calculate MFTF per LET (obtain SEU data).
  - Graph $R(\Phi)$ for a variety of LET values and their associated MFTFs: $R(\Phi) = e^{-\Phi/MFTF}$.
  - For selected ranges of LETs, use an upper bound of particle flux (number of particles/(cm²·10-minutes)) to determine if the system will meet the mission's reliability requirements.

To be presented by Melanie Berg at the NASA Electronics Parts and Packaging (NEPP) Electronics Technology Workshop (ETW), Greenbelt, MD, June 26-29, 2017


Flux versus LET Histogram for a 10-minute Window
Geosynchronous Equatorial Orbit (GEO), 100-mils shielding

Bins are selected based on $\sigma_{SEU}$ data points.

[Figure: histogram of particle flux per LET bin, log scale. LET bins (MeV·cm²/mg): 0 to 0.07, 0.07 to 0.14, 0.14 to 1.8, 1.8 to 3.6, 3.6 to 20, 20 to 40, 40 and over. Annotation: "We will analyze system"]
MFTF versus LET for the Xilinx V5 Embedded PowerPC Core and the Xilinx V5QV MicroBlaze Soft Processor Core

MFTF = $1/\sigma_{SEU}$

• V5QV: no system errors were observed below LET = 1.8 MeV·cm²/mg. Total fluence > 5.0×10^8 particles/cm².
• PowerPC:
  - No system errors were observed below LET = 0.07 MeV·cm²/mg with total fluence = 1.0×10^8 particles/cm².
  - Hence, at 0.07, we will bound this bin with MFTF = 1.0×10^8 particles/cm².
  - More tests would increase the MFTF for this bin.

[Figure: MFTF (particles/cm²) versus LET (MeV·cm²/mg), log scale. Legend: V5QV MicroBlaze with cache enabled; V5 PowerPC.]
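The zero-error rule used for the PowerPC bin above can be written as a small helper. This is a sketch, not the authors' tooling: the function name is hypothetical, and the second usage line uses made-up numbers. When errors are observed, MFTF is fluence divided by error count (that is, $1/\sigma_{SEU}$); when none are observed, the tested fluence stands in as a conservative value, since the true MFTF is at least that large.

```python
def mftf_from_test(total_fluence: float, errors_observed: int) -> float:
    """Estimate MFTF (particles/cm^2) from one test run at a single LET.

    With errors: MFTF = fluence / #errors, i.e. 1 / sigma_SEU.
    With no errors: return the tested fluence as a conservative stand-in.
    """
    if total_fluence <= 0:
        raise ValueError("total_fluence must be positive")
    if errors_observed < 0:
        raise ValueError("errors_observed must be non-negative")
    if errors_observed == 0:
        return total_fluence
    return total_fluence / errors_observed


# PowerPC at LET = 0.07 MeV*cm^2/mg: no errors in 1.0e8 particles/cm^2.
print(mftf_from_test(1.0e8, 0))
# Hypothetical run with errors (illustrative numbers only).
print(mftf_from_test(1.0e6, 20))
```

Because the zero-error value grows with tested fluence, additional beam time raises the bound, which is the point made in the last bullet.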


Reliability across Fluence up to LET = 0.07 MeV·cm²/mg: Low Bound Analysis

Binned GEO environment data shows approximately 3000 particles/(cm²·10-minutes) in the range of 0.0 MeV·cm²/mg to 0.07 MeV·cm²/mg. We are using MFTF for 0.07 MeV·cm²/mg to upper bound this bin.

PowerPC: MFTF = 1.0×10^8, so $R(\Phi) = e^{-\Phi/(1.0 \times 10^{8})}$

Used MFTF = 1.0×10^8 because that was the maximum fluence for tests (no errors observed).

[Figure: reliability versus fluence (particles/cm²), fluence from 0 to 9000.]

Reliability at 3000 particles/(cm²·10-minutes) > 99.99% for the PowerPC design implementation. The number of "9's" could be increased with more tests.
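The claim for this bin can be checked directly from the two numbers on the slide. The snippet below is a quick arithmetic sketch; the variable names are arbitrary.

```python
import math

fluence = 3000.0  # particles/cm^2 in the 10-minute window (from the slide)
mftf = 1.0e8      # particles/cm^2, bound taken from the zero-error test

reliability = math.exp(-fluence / mftf)  # R(Phi) = exp(-Phi / MFTF)
print(f"R(3000) = {reliability:.6f}")
print("exceeds 99.99%:", reliability > 0.9999)
```

The exponent is only $-3 \times 10^{-5}$, so the result sits between four and five nines, in line with the statement above.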
Reliability across Fluence up to LET = 0.14 MeV·cm²/mg

Binned GEO environment data shows approximately 11 particles/(cm²·10-minutes) in the range of 0.07 MeV·cm²/mg to 0.14 MeV·cm²/mg. We are using MFTF for 0.1 MeV·cm²/mg to upper bound this bin.

[Figure: reliability versus fluence (particles/cm²), fluence from 0 to 22.5, for the PowerPC design, plotted as $R(\Phi) = e^{-\Phi/MFTF}$.]

Reliability at 5 particles/(cm²·10-minutes) > 99.999% for the V5QV PowerPC design implementation.
Reliability across Fluence up to LET = 1.8 MeV·cm²/mg

Binned GEO environment data shows approximately 9 particles/(cm²·10-minutes) in the range of 0.14 MeV·cm²/mg to 1.8 MeV·cm²/mg. We are using MFTF for 1.8 MeV·cm²/mg to upper bound this bin.

PowerPC: MFTF = 6.0×10^4

[Figure: reliability versus fluence (particles/cm²), fluence from 0 to 28. Annotation: we fall below 99.99% at approximately 5 particles/cm².]

Reliability at 9 particles/(cm²·10-minutes) > 99.9% for the PowerPC design implementation. This is the most susceptible bin for the system.
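Since the requirement is stated in "nines", it can help to convert $R(\Phi)$ into a count of nines. The helper below is a hypothetical sketch (the name `count_nines` is not from the presentation), applied to the fluence and MFTF quoted for the 0.14 to 1.8 MeV·cm²/mg bin.

```python
import math


def count_nines(fluence: float, mftf: float) -> int:
    """Number of leading nines in R(Phi) = exp(-Phi / MFTF).

    Requires fluence > 0 so that the unreliability is non-zero.
    """
    if fluence <= 0 or mftf <= 0:
        raise ValueError("fluence and mftf must be positive")
    unreliability = -math.expm1(-fluence / mftf)  # 1 - R, computed stably
    return math.floor(-math.log10(unreliability))


# PowerPC, 0.14 to 1.8 MeV*cm^2/mg bin: 9 particles/cm^2, MFTF = 6.0e4.
print(count_nines(9.0, 6.0e4))
```

Using `math.expm1` avoids the loss of precision that comes from subtracting a value very close to 1 from 1.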
Reliability across Fluence up to LET = 3.6 MeV·cm²/mg

Binned GEO environment data shows approximately 0.23 particles/(cm²·10-minutes) in the range of 1.8 MeV·cm²/mg to 3.6 MeV·cm²/mg.

[Figure: reliability versus fluence (particles/cm²), fluence from 0 to 10, for the V5QV and PowerPC designs, each plotted as $R(\Phi) = e^{-\Phi/MFTF}$.]

Within this LET range, reliability at 0.23 particles/(cm²·10-minutes) > 99.999% for both design implementations.
Reliability across Fluence at LET = 40 MeV·cm²/mg

Binned GEO environment data shows approximately 0.07 particles/(cm²·10-minutes) in the range of 3.6 MeV·cm²/mg to 40.0 MeV·cm²/mg.

[Figure: reliability versus fluence (particles/cm²), fluence from 0 to 0.1, for both designs, each plotted as $R(\Phi) = e^{-\Phi/MFTF}$. Annotation: we fall below 99.99% at approximately 0.02 particles/cm².]

Within this LET range, reliability at 0.07 particles/(cm²·10-minutes) > 99.9% for both design implementations. We can refine by analyzing smaller bins.
Example Conclusion

Using the proposed methodology, the commercial Xilinx V5 device will meet project requirements.

In this case, the project is able to save money by selecting the significantly cheaper FPGA device and gain performance because of the embedded PowerPC.
Conclusions

• This study transforms proven classical reliability models into the SEU particle fluence domain. The intent is to better characterize SEU responses for complex systems.
• The method for reliability-model application is as follows:
  - SEU data are obtained as MFTF.
  - Reliability curves (in the fluence domain) are calculated using MFTF and are analyzed with a piecemeal approach.
  - Environment data are then used to determine particle flux exposure within required windows of mission operation.
• The proposed method does not rely on data-fitting and hence removes a significant source of error.
• The proposed method provides information for highly SEU-susceptible scenarios and hence enables a better choice of mitigation strategy.
• This is preliminary work. There is more to come.

This methodology expresses SEU behavior and response in terms that missions understand via classical reliability metrics.